Aspect-Based Sentiment Analysis is a prominent research area with potential applications in social media analytics, business, finance, and health. Prior work in this area is primarily based on supervised methods, with a few techniques using weak supervision that is limited to predicting a single aspect category per review sentence. In this paper, we present an extremely weakly supervised multi-label Aspect Category Sentiment Analysis framework that does not use any labelled data. We rely only on a single word per class as initial indicative information, and we further propose an automatic word selection technique to choose these seed category and sentiment words. We explore unsupervised language model post-training to improve the overall performance, and propose a multi-label generator model to generate multiple aspect category-sentiment pairs per review sentence. Experiments conducted on four benchmark datasets show that our method outperforms other weakly supervised baselines by a significant margin.
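The single-seed-word labelling idea above can be sketched as follows. The seed words, the nearest-sentiment-word heuristic, and the example review are illustrative assumptions for the sketch, not the paper's actual lexicon or method details.

```python
# One seed word per aspect category and per sentiment polarity
# (illustrative choices, not the paper's vocabulary).
ASPECT_SEEDS = {"food": "food", "service": "staff", "price": "price"}
SENTIMENT_SEEDS = {"positive": "great", "negative": "bad"}

def label_sentence(sentence):
    """Pair each seed-matched aspect with its nearest sentiment seed word,
    yielding multiple (aspect, sentiment) labels per sentence."""
    tokens = sentence.lower().split()
    sentiment_hits = [(i, pol) for i, t in enumerate(tokens)
                      for pol, w in SENTIMENT_SEEDS.items() if t == w]
    pairs = []
    for aspect, word in ASPECT_SEEDS.items():
        if word in tokens and sentiment_hits:
            i = tokens.index(word)
            _, polarity = min(sentiment_hits, key=lambda hit: abs(hit[0] - i))
            pairs.append((aspect, polarity))
    return pairs

labels = label_sentence("the food was great but the staff was bad")
```

A single sentence here yields two category-sentiment pairs, which is the multi-label behaviour the framework targets.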
Computational humor detection systems rarely model the subjectivity of humor responses, or consider an alternative reaction to humor, namely offense. We analyze a large dataset of humor and offense ratings from male and female annotators of different age groups. We find that women link these two concepts more strongly than men, and that they tend to give lower humor scores and higher offense scores. We also find that the correlation between humor and offense increases with age. Although there were no gender or age differences in finding humor, women and older annotators indicated that they understood the joke texts more often than men. We discuss implications for computational humor detection and downstream tasks.
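The humor-offense correlation analyses described above rest on plain rating correlation. A self-contained sketch, using Pearson correlation and illustrative ratings rather than the paper's annotation data:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length rating lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy per-item ratings from one hypothetical annotator.
humor = [4.0, 1.0, 3.0, 0.5]
offense = [0.5, 3.5, 1.0, 4.0]
r = pearson(humor, offense)
```

In this toy data the two scales move in opposite directions, so `r` is strongly negative; the paper's point is that the strength of this association varies systematically with annotator gender and age.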
Achieving AI-native wireless networks is necessary for the operation of future 6G applications such as the metaverse. Nonetheless, current communication schemes are, at heart, a mere reconstruction process that lacks reasoning. One key solution that enables evolving wireless communication into a human-like conversation is semantic communications. In this paper, a novel machine reasoning framework is proposed to pre-process and disentangle source data so as to make it semantic-ready. In particular, a novel contrastive learning framework is proposed, whereby instance and cluster discrimination are performed on the data. These two tasks increase the cohesiveness between data points mapping to semantically similar content elements and disentangle data points of semantically different content elements. Subsequently, the semantic deep clusters formed are ranked according to their level of confidence. Deep semantic clusters of highest confidence are considered learnable, semantic-rich data, i.e., data that can be used to build a language in a semantic communications system. The least confident ones are considered random, semantic-poor, memorizable data that must be transmitted classically. Our simulation results showcase the superiority of our contrastive learning approach in terms of semantic impact and minimalism. In fact, the length of the semantic representation achieved is reduced by 57.22% compared to vanilla semantic communication systems, thus achieving minimalist semantic representations.
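The confidence-based split between semantic-rich and semantic-poor data described above can be sketched as follows. The clusters, the mean-cosine-to-centroid confidence proxy, and the 0.5 threshold are illustrative assumptions standing in for the deep clusters and confidence measure of the paper.

```python
import math

def centroid(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster_confidence(points):
    """Mean cosine similarity of members to their centroid: a cheap proxy
    for how semantically coherent a deep cluster is."""
    c = centroid(points)
    return sum(cosine(p, c) for p in points) / len(points)

clusters = {
    "tight": [[1.0, 0.1], [0.9, 0.1], [1.1, 0.0]],    # semantically coherent
    "diffuse": [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.2]], # near-random content
}
ranked = sorted(clusters, key=lambda k: cluster_confidence(clusters[k]),
                reverse=True)
# High-confidence clusters feed the semantic language; the rest are
# transmitted classically.
semantic_rich = [k for k in ranked if cluster_confidence(clusters[k]) > 0.5]
```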
This work addresses the problems of (a) designing utilization measurements of trained artificial intelligence (AI) models and (b) explaining how training data are encoded in AI models based on those measurements. The problems are motivated by the lack of explainability of AI models in security- and safety-critical applications, such as the use of AI models for classification of traffic signs in self-driving cars. We approach the problems by introducing theoretical underpinnings of AI model utilization measurement and understanding patterns in utilization-based class encodings of traffic signs at the level of computation graphs (AI models), subgraphs, and graph nodes. Conceptually, utilization is defined at each graph node (computation unit) of an AI model based on the number and distribution of unique outputs in the space of all possible outputs (tensor-states). In this work, utilization measurements are extracted from AI models, which include poisoned and clean AI models. In contrast to clean AI models, the poisoned AI models were trained with traffic sign images containing systematic, physically realizable traffic sign modifications (i.e., triggers) that change a correct class label to another label in the presence of such a trigger. We analyze class encodings of such clean and poisoned AI models, and conclude with implications for trojan injection and detection.
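The node-level utilization measure described above can be sketched as follows: quantize a node's outputs into discrete tensor-states, then report how many unique states are used and the entropy of their distribution. The quantization resolution and the toy activations are illustrative assumptions, not the paper's actual measurement procedure.

```python
import math
from collections import Counter

def node_utilization(outputs, resolution=0.5):
    """Quantize a node's outputs into tensor-states, then report the number
    of unique states used and the entropy of their distribution."""
    states = Counter(tuple(round(v / resolution) for v in out)
                     for out in outputs)
    total = sum(states.values())
    probs = [c / total for c in states.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    return len(states), entropy

# Clean model node: outputs spread over several tensor-states.
clean = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1]]
# Poisoned model node: trigger inputs collapse onto a single state.
poisoned = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [1.1, 0.1]]

clean_states, clean_h = node_utilization(clean)
pois_states, pois_h = node_utilization(poisoned)
```

Comparing such per-node statistics between clean and poisoned models is the kind of utilization-based class-encoding analysis the abstract refers to.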
In this research work, we have demonstrated the application of Mask R-CNN (Regional Convolutional Neural Network), a deep-learning algorithm for computer vision and specifically object detection, to the semiconductor defect inspection domain. Stochastic defect detection and classification during semiconductor manufacturing has grown to be a challenging task as we continuously shrink circuit pattern dimensions (e.g., for pitches less than 32 nm). Defect inspection and analysis by state-of-the-art optical and e-beam inspection tools is generally driven by rule-based techniques, which often leads to misclassification, thereby necessitating human expert intervention. In this work, we have revisited and extended our previous deep-learning-based defect classification and detection method towards improved defect instance segmentation in SEM images, capturing the precise extent of each defect and generating a mask for each defect category/instance. This also makes it possible to extract and calibrate each segmented mask and quantify the pixels that make up each mask, which in turn enables us to count defect instances per category as well as to calculate surface area in terms of pixels. We aim at detecting and segmenting different types of inter-class stochastic defect patterns, such as bridge, break, and line collapse, as well as accurately differentiating between intra-class multi-categorical defect bridge scenarios (thin/single/multi-line/horizontal/non-horizontal) for aggressive pitches as well as thin resists (high-NA applications). Our proposed approach demonstrates its effectiveness both quantitatively and qualitatively.
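The mask-based quantification step described above can be sketched as follows: given per-instance binary masks, count instances per defect class and compute the surface area of each instance in pixels. The masks and class names are illustrative assumptions, not real SEM data.

```python
from collections import defaultdict

def quantify_defects(instances):
    """instances: list of (class_name, binary_mask) pairs, where each mask
    is a 2D list of 0/1 pixels. Returns per-class counts and per-instance
    pixel areas."""
    counts = defaultdict(int)
    areas = []
    for cls, mask in instances:
        counts[cls] += 1
        area = sum(sum(row) for row in mask)  # pixels belonging to the mask
        areas.append((cls, area))
    return dict(counts), areas

bridge_mask = [[0, 1, 1],
               [0, 1, 1],
               [0, 0, 0]]
break_mask = [[1, 0, 0],
              [1, 0, 0],
              [1, 0, 0]]
counts, areas = quantify_defects([("bridge", bridge_mask),
                                  ("break", break_mask)])
```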
Session-based recommender systems (SBRSs) have shown performance superior to conventional methods. However, they exhibit limited scalability on large-scale industrial datasets, since most models learn one embedding per item. This leads to large memory requirements (one vector stored per item) and poor performance on sparse sessions with cold-start or unpopular items. Using one public and one large industrial dataset, we show experimentally that state-of-the-art SBRSs perform poorly on sparse sessions with sparse items. We propose M2TRec, a metadata-aware, multi-task Transformer model for session-based recommendation. Our proposed method learns a transformation function from item metadata to embeddings and is therefore item-ID free (i.e., it does not need to learn one embedding per item). It integrates item metadata to learn shared representations of various item attributes. During inference, new or unpopular items are assigned to attributes shared with items previously observed during training, and therefore obtain representations similar to those items, enabling recommendation of even cold-start and sparse items. In addition, M2TRec is trained in a multi-task setting to predict the next item in a session together with its primary category and subcategory. Our multi-task strategy makes the model converge faster and significantly improves overall performance. Experimental results show significant performance gains using our proposed approach on sparse items in the two datasets.
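The item-ID-free idea described above can be sketched as follows: item representations are built from shared attribute embeddings rather than one learned vector per item, so an unseen item that shares attributes with training items still gets a meaningful representation. The attribute names, the embedding dimension, and the hash-derived attribute vectors (standing in for learned ones) are all assumptions for the sketch.

```python
import hashlib

DIM = 4

def attribute_embedding(attribute):
    """Deterministic per-attribute vector; in the real model this would be
    a learned, shared attribute embedding."""
    digest = hashlib.sha256(attribute.encode()).digest()
    return [digest[i] / 255.0 for i in range(DIM)]

def item_representation(metadata):
    """Average the shared attribute embeddings: no per-item table needed."""
    vecs = [attribute_embedding(a) for a in metadata]
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(DIM)]

seen_item = item_representation(["category:shoes", "brand:acme"])
# A brand-new item ID with the same metadata inherits the same representation.
cold_item = item_representation(["category:shoes", "brand:acme"])
```

Because the representation depends only on metadata, memory cost grows with the number of attributes, not the (much larger) number of items.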
This paper considers improving the wireless communication and computation efficiency of federated learning (FL) via model quantization. In the proposed bitwidth FL scheme, edge devices train and transmit quantized versions of their local FL model parameters to a coordinating server, which aggregates them into a quantized global model and synchronizes the devices. The goal is to jointly determine the bitwidths used for local FL model quantization and the set of devices participating in FL training at each iteration. This is posed as an optimization problem whose objective is to minimize the training loss of quantized FL under a per-iteration device sampling budget and delay requirement. To derive the solution, an analytical characterization is performed to show how the limited wireless resources and the induced quantization error affect the performance of the proposed FL method. The analytical results show that the improvement of the FL training loss between two consecutive iterations depends on the device selection and quantization scheme, as well as on several parameters inherent to the learned model. Given linear-regression-based estimates of these model properties, it is shown that the FL training process can be described as a Markov decision process (MDP), and a model-based reinforcement learning (RL) method is then proposed to optimize the action selection over iterations. Compared to model-free RL, this model-based RL approach leverages the derived mathematical characterization of the FL training process to discover effective device selection and quantization schemes without imposing additional device communication overhead. Simulation results show that the proposed FL algorithm can reduce convergence time by 29% and 63% compared to a model-free RL method and the standard FL method, respectively.
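The per-device bitwidth quantization step described above can be sketched with a uniform quantizer: each local parameter is mapped to one of 2^b levels before upload, and a higher bitwidth lowers the induced quantization error at the cost of more bandwidth. The parameter range, values, and bitwidths are illustrative assumptions.

```python
def quantize(params, bits, lo=-1.0, hi=1.0):
    """Uniformly quantize each parameter in [lo, hi] to 2**bits levels."""
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    return [lo + round((max(lo, min(hi, p)) - lo) / step) * step
            for p in params]

def quantization_error(params, bits):
    """Mean squared error induced by quantizing at the given bitwidth."""
    q = quantize(params, bits)
    return sum((p - v) ** 2 for p, v in zip(params, q)) / len(params)

params = [0.31, -0.47, 0.05, 0.92]
# Higher bitwidth -> lower induced quantization error (but more bandwidth),
# which is the trade-off the joint optimization navigates.
err_low = quantization_error(params, 2)
err_high = quantization_error(params, 8)
```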
Large pretrained models such as GPT-3 have had a tremendous impact on modern natural language processing by leveraging self-supervised learning to learn salient representations that can readily be fine-tuned on a wide variety of downstream tasks. We investigate the possibility of transferring such advances to molecular machine learning by building a chemical foundation model, ChemBERTa-2, using the language of SMILES. While labeled data for molecular prediction tasks is typically scarce, libraries of SMILES strings are readily available. In this work, we build upon ChemBERTa by optimizing the pretraining process. We compare multi-task and self-supervised pretraining across different hyperparameters and pretraining dataset sizes, up to 77M compounds from PubChem. To our knowledge, the 77M set constitutes one of the largest datasets used for molecular pretraining to date. We find that with these pretraining improvements, we are competitive with existing state-of-the-art architectures on the MoleculeNet benchmark suite. We analyze the degree to which improvements in pretraining translate to improvements on downstream tasks.
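A SMILES language model of the kind described above starts from SMILES-level tokenization. The sketch below uses a commonly seen SMILES tokenization regex; treating it as the exact ChemBERTa-2 tokenizer is an assumption.

```python
import re

# Brackets, two-letter halogens, single atoms, bonds, branches, ring
# closures, charges, and stereo markers, matched longest-first.
SMILES_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOPSFIbcnops]|@@?|=|#|\(|\)|\d|\+|-|/|\\|\.|%\d{2})"
)

def tokenize_smiles(smiles):
    tokens = SMILES_PATTERN.findall(smiles)
    assert "".join(tokens) == smiles, "untokenizable characters present"
    return tokens

tokens = tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
```

The reassembly check guards against silently dropped characters, a common failure mode when regex tokenizers meet unusual SMILES.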
Brain-computer interfaces (BCIs) have tremendous potential to address many limitations of brain-signal analysis, to help resolve mental disorders, and to restore missing limb function through neurally controlled implants. However, no single implant is yet available and safe for daily-life use. Most proposed implants have multiple implementation problems, such as infection hazards and heat dissipation, which limit their usability and make production under regulation and quality control more challenging. Wireless implants do not require a chronic skull wound. However, the complex clustering-based neuron identification algorithms inside current implant chips consume significant power and bandwidth, leading to greater heat-dissipation problems and draining the implant's battery. Spike sorting is the core unit of an invasive BCI chip and plays a significant role in power consumption, accuracy, and area. Therefore, in this study we propose a low-power, adaptive, simplified VLSI architecture, "Zydeco-Style", for BCI spike sorting that is computationally inexpensive in the worst case and achieves an accuracy of up to 93.5%. The architecture uses a low-power Bluetooth wireless communication module to communicate with an external IoT medical ICU device. The proposed architecture was implemented and simulated in Verilog. In addition, we present an implant conceptual design.
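The spike-sorting pipeline such an architecture accelerates can be sketched as threshold detection followed by a cheap template match, avoiding the expensive clustering step. The signal, threshold, and templates below are illustrative assumptions, not the Zydeco-Style implementation.

```python
# Two hypothetical spike shapes: a narrow spike and a wide spike.
TEMPLATES = {"neuron_A": [0.2, 1.0, 0.2], "neuron_B": [0.6, 1.0, 0.6]}

def detect_spikes(signal, threshold=0.8):
    """Return indices where the signal crosses the threshold upward."""
    return [i for i in range(1, len(signal))
            if signal[i] >= threshold > signal[i - 1]]

def classify_spike(window):
    """Assign the nearest template by squared distance (no clustering)."""
    def dist(name):
        return sum((a - b) ** 2 for a, b in zip(window, TEMPLATES[name]))
    return min(TEMPLATES, key=dist)

signal = [0.0, 0.2, 1.0, 0.2, 0.0, 0.6, 1.0, 0.6, 0.0]
spikes = detect_spikes(signal)
labels = [classify_spike(signal[i - 1:i + 2]) for i in spikes]
```

Keeping only thresholding and a fixed-size distance computation is what makes this kind of pipeline cheap enough for an implant-side VLSI budget.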
Understanding human behavior and monitoring mental health is essential for maintaining the safety of communities and society. With the increase of uncontrolled mental health problems during the pandemic, early detection of mental issues has become crucial. Nowadays, the use of intelligent virtual personal assistants (IVAs) has increased worldwide; individuals use their voice to control these devices to fulfill requests and obtain different services. This paper proposes a novel deep learning model based on gated recurrent neural networks and convolutional neural networks to understand human emotion from speech, in order to improve IVA services and monitor mental health.
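The gated-recurrent building block of the model described above can be sketched as a single scalar GRU cell. All weights and the toy input sequence are illustrative assumptions; a real model would stack convolutional feature extraction, learned weight matrices, and an emotion classifier on top.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_cell(x, h, w):
    """One scalar GRU step: update gate z, reset gate r, candidate h~."""
    z = sigmoid(w["wz"] * x + w["uz"] * h)
    r = sigmoid(w["wr"] * x + w["ur"] * h)
    h_tilde = math.tanh(w["wh"] * x + w["uh"] * (r * h))
    return (1.0 - z) * h + z * h_tilde  # gated mix of old and new state

# Illustrative (untrained) scalar weights.
weights = {"wz": 1.0, "uz": -0.5, "wr": 0.8, "ur": 0.1, "wh": 1.2, "uh": 0.7}

h = 0.0
for x in [0.5, -0.3, 0.9]:  # a toy sequence of acoustic features
    h = gru_cell(x, h, weights)
```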